4: Compute the Standardize Precipitation Index based on spatial polygons


1 Introduction

In this tutorial, we extract the CHIRPS precipitation observations in Suriname, based on the administrative divisions of the Suriname Survey of Living Conditions for the years 2016/17 and 2022. Based on the extracted precipitation, the tutorial shows how to compute the Standardized Precipitation Index (SPI) and merge it with the survey based on the date of interviews.

2 Code

2.1 Set Up

We start by setting up the stage for our analysis.

First, we load the necessary packages. We load only climatic4economist package that contains several functions meant to extract and merge spatial variables with surveys. During the tutorial we will use other packages but instead of loading all the package at the begging we will call specific function each time.

library(climatic4economist)

In the setup, we also want to create the paths to the various data sources and load the necessary functions for extraction. Note .. means one step back to the folder directory, i.e. one folder back.

Note that how to set up the paths depends on your folder organization but there are overall two approaches:

  1. you can use the R project, by opening the project directly you don’t need to set up the path to the project. Automatically the project figures out on its own where it is located in the computer and set that path as working folder.
  2. you can manually set the working folder with the function setwd().
# path to data folder
path_to_data <- file.path("..",
                          "..", "data")

# survey 
path_to_wave_1 <- file.path(path_to_data, "survey", "surname", "wave 1",
                            "RT001_Public.dta")

path_to_wave_2 <- file.path(path_to_data, "survey", "surname", "wave 2", 
                            "2022 RT001_Housing_plus.dta")

# administrative division
path_to_adm_div <- file.path(path_to_data, "adm_div", "GAUL")

# weather variables
path_to_pre_monthly <- file.path(path_to_data, "weather", "CHIRPS", "monthly",
                                 "chirps-v2.0.monthly.nc")
# to result folder
path_to_result <- file.path(path_to_data, "result")
1
concatenate the string to make a path
2
.. means one folder back


2.2 Read the data

2.2.1 Survey

We begin by reading the surveys, which in this case consist of two waves with potentially different locations. As a result, we need to load both waves. The waves are stored as dta files, so we use the haven::read_dta() function to read them.

We only need the hhid, the survey coordinates, and the interview dates. We use dplyr::select() to choose these variables. This passage is optional and we bring with us all the variables, but we won’t use them. Note that the first wave does not include the interview date.

We combine the two waves using dplyr::bind_rows().

We can use the head() function to preview the data and see how it looks.

wave_1 <- haven::read_dta(path_to_wave_1) |>
    dplyr::select(hhid, lat_cen, long_cen) |> 
    dplyr::mutate(wave = 1)

wave_2 <- haven::read_dta(path_to_wave_2) |>
    dplyr::select(hhid, end_date_n, lat_cen, long_cen) |>
    dplyr::mutate(wave = 2)

survey <- dplyr::bind_rows(wave_1, wave_2)

head(survey)
# A tibble: 6 × 5
     hhid lat_cen long_cen  wave end_date_n
    <dbl>   <dbl>    <dbl> <dbl> <date>    
1 1010031    5.82    -55.2     1 NA        
2 1010041    5.82    -55.2     1 NA        
3 1010051    5.82    -55.2     1 NA        
4 1010061    5.82    -55.2     1 NA        
5 1010121    5.82    -55.2     1 NA        
6 1010131    5.82    -55.2     1 NA        

2.2.2 Aministrative Divisions

We read the spatial file containing the national borders of Suriname, we use read_GAUL() to load it. By printing the spatial data, we can obtain key information, such as the dimensions (number of rows and variables), the geometry (which indicates the type of spatial object), and the coordinate reference system (CRS), which links the coordinates to precise locations on the Earth’s surface. The CRS is particularly important when working with different spatial datasets, as mismatched CRSs can prevent the datasets from aligning correctly.

adm_div <- read_GAUL(path_to_adm_div, iso = "SUR", lvl = 2)
adm_div
 class       : SpatVector 
 geometry    : polygons 
 dimensions  : 62, 4  (geometries, attributes)
 extent      : -58.04987, -53.97982, 1.83114, 6.004546  (xmin, xmax, ymin, ymax)
 source      : GAUL-SUR-ADM2.geojson
 coord. ref. : lon/lat WGS 84 (EPSG:4326) 
 names       : ID_adm_div   iso  adm_div_1  adm_div_2
 type        :      <chr> <chr>      <chr>      <chr>
 values      :          1   SUR Brokopondo  Brownsweg
                        2   SUR Brokopondo    Centrum
                        3   SUR Brokopondo Klaaskreek

2.2.3 Weather

Finally, we load the precipitation data. Climatic data typically comes in the form of raster data. A raster represents a two-dimensional image as a rectangular matrix or grid of pixels. These are spatial rasters because they are georeferenced, meaning each pixel (or “cell” in GIS terms) represents a square region of geographic space. The value of each cell reflects a measurable property (either qualitative or quantitative) of that region. In this case, the values are monthly precipitation that fell in that region. We use the function terra::rast() to load the raster data.

This particular raster has global coverage, so we crop it to focus on the country area to reduce its size. Although this step is not strictly necessary, it helps decrease the memory load and makes visualizations more manageable. We use the function crop_with_buffer() for this purpose.

When we print the raster, we obtain several key details. The dimension tells us how many cells the raster consists of and the number of layers, each layer corresponds to a particular months for which the observations were made. We also get the spatial resolution, which defines the size of each square region in geographic space, and the coordinate reference system (CRS). Given the importance of the CRS, we extract it using terra::crs() and save it for later use.

We also rename the raster layers to reflect the corresponding dates for each layer, as this is useful if we want to track the dates. We use terra::time() to extract the dates.

Note

Note that rasters can store time information in different ways, so it may not always be possible to retrieve dates in this manner. A common alternative is for dates to be embedded in the layer names, in which case we wouldn’t need to rename the layers.

weather_monthly <- terra::rast(path_to_pre_monthly) |>
    crop_with_buffer(adm_div)
weather_monthly
class       : SpatRaster 
dimensions  : 83, 81, 527  (nrow, ncol, nlyr)
resolution  : 0.05, 0.05  (x, y)
extent      : -58.05, -54, 1.85, 6  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
source(s)   : memory
varname     : precip (Climate Hazards group InfraRed Precipitation with Stations) 
names       :  precip_1,  precip_2,  precip_3, precip_4, precip_5,   precip_6, ... 
min values  :  36.21083,  49.09035,  34.04647, 153.5361, 103.0583,   90.83206, ... 
max values  : 403.70178, 653.45990, 345.78024, 622.1295, 795.1603, 1034.43372, ... 
unit        :  mm/month,  mm/month,  mm/month, mm/month, mm/month,   mm/month, ... 
time (days) : 1981-01-01 to 2024-11-01 
names(weather_monthly) <- terra::time(weather_monthly)
weather_monthly
class       : SpatRaster 
dimensions  : 83, 81, 527  (nrow, ncol, nlyr)
resolution  : 0.05, 0.05  (x, y)
extent      : -58.05, -54, 1.85, 6  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
source(s)   : memory
varname     : precip (Climate Hazards group InfraRed Precipitation with Stations) 
names       : 1981-01-01, 1981-02-01, 1981-03-01, 1981-04-01, 1981-05-01, 1981-06-01, ... 
min values  :   36.21083,   49.09035,   34.04647,   153.5361,   103.0583,   90.83206, ... 
max values  :  403.70178,  653.45990,  345.78024,   622.1295,   795.1603, 1034.43372, ... 
unit        :   mm/month,   mm/month,   mm/month,   mm/month,   mm/month,   mm/month, ... 
time (days) : 1981-01-01 to 2024-11-01 


2.3 Georeference the survey

As we’ve mentioned, the weather data is georeferenced, so we need to ensure the same for the survey data. Since many households share the same coordinates, they are linked to the same weather events. To reduce computation time, we extract data only for the unique coordinates, rather than for each household. Moreover, we must ensure that we can later associate the correct weather data with the right household, we do this by creating an merging variable called ID.

This is handled by the prepare_coord() function, which requires the coordinates’ variable names as input.

We can print the result to check the transformation. The new column, ID, is created by prepare_coord() and identifies each unique coordinate. This is used to merge the weather data with the household data.

srvy_coord <- prepare_coord(survey, lat_var = lat_cen, lon_var = long_cen)
srvy_coord
# A tibble: 4,535 × 6
   ID           hhid lat_cen long_cen  wave end_date_n
   <chr>       <dbl>   <dbl>    <dbl> <dbl> <date>    
 1 1     23903250101    3.85    -54.2     2 2022-12-06
 2 1     23903250201    3.85    -54.2     2 2022-12-06
 3 1     23903250301    3.85    -54.2     2 2022-12-06
 4 1     23903250401    3.85    -54.2     2 2022-12-07
 5 1     23903252201    3.85    -54.2     2 2022-12-07
 6 1     23903252301    3.85    -54.2     2 2022-12-06
 7 1     23903252501    3.85    -54.2     2 2022-12-06
 8 1     23903252601    3.85    -54.2     2 2022-12-06
 9 1     23903253301    3.85    -54.2     2 2022-12-07
10 1     23903253501    3.85    -54.2     2 2022-12-06
# ℹ 4,525 more rows

Once we have the unique coordinates, we are ready to transform them into spatial points using the georef_coord() function. When performing this transformation, it’s crucial to set the correct CRS, which must match that of the weather data. The CRS is provided as an argument of the function, using the previously saved CRS from the weather data. Also the georef_coord() function requires the coordinates’ variable names as input.

srvy_geo <- georef_coord(srvy_coord,
                          geom = c("long_cen", "lat_cen"),
                          crs = "EPSG:4326")
srvy_geo
 class       : SpatVector 
 geometry    : points 
 dimensions  : 688, 1  (geometries, attributes)
 extent      : -57.05207, -54.05524, 3.849064, 5.943465  (xmin, xmax, ymin, ymax)
 coord. ref. : lon/lat WGS 84 (EPSG:4326) 
 names       :    ID
 type        : <chr>
 values      :     1
                   2
                   3


2.4 PLot

A good practice when working with spatial data is to plot it. This is the best way to verify that everything is working as expected.

First, we plot the survey coordinates to ensure they are correctly located within the country and to examine their spatial distribution.

terra::plot(adm_div, col = "grey", main = "Suriname and PSU centroids")
terra::points(srvy_geo, col = "gold", alpha = 0.5, cex = 0.5)
srvy_geo |> 
    tidyterra::filter(ID %in% c("21", "22", "652")) |>
    terra::text("ID", halo = TRUE, hc = "blue", col = "white", hw = 0.2)

We confirm that the survey locations are within the country borders, which is great! We also observe that the spatial distribution of survey coordinates is neither random nor uniform; most are concentrated near the capital and along the coast.

However, some coordinates falls outside the administrative division, we highlights them with the blue halo. It may difficult to see them without the zoom , but it is worth notice that as it has consequences for which administrative division to associate them.

In many microeconomics applications some coordinates fall outside the national order of the country, this might happen for various reasons. Sometimes, the coordinates are modified to guarantee anonimacy of the household and the modification bring the coordinates outside the borders, other time it is just an error of the GPS, or the borders maps are not enought precise.

Next, we plot a layer of the precipitation data to see how it overlaps with the spatial coordinates.

terra::plot(weather_monthly, "2024-10-01", col = terra::map.pal("water"),
            main = "Monthly precipitation at 2024-10 and survey location")
terra::lines(adm_div, col = "white", lwd = 2)
terra::points(srvy_geo, col = "red", alpha = 0.5, cex = 0.5)

Once again, the survey coordinates align with the precipitation data, which is great! We can also observe the high spatial resolution of the CHIRPS dataset. However, despite this high resolution, some survey coordinates still fall within the same cell.


2.5 Merge administrative division and survey

We want to associated the survey location to the administrative divisions. We do it by looking in which administrative division each survey location fall in. We save this information for later use.

srvy_adm_div <- get_poly_attr_for_point(point = srvy_geo, poly = adm_div)
Some spatial points are outside the polygon. They will be assigned to the closest polygon.
These are the points: 21 22 652 688
srvy_adm_div
# A tibble: 688 × 7
   ID      lon   lat ID_adm_div iso   adm_div_1  adm_div_2     
   <chr> <dbl> <dbl> <chr>      <chr> <chr>      <chr>         
 1 1     -54.2  3.85 55         SUR   Sipaliwini Tapanahony    
 2 2     -55.5  3.93 52         SUR   Sipaliwini Boven Suriname
 3 3     -55.5  4.00 52         SUR   Sipaliwini Boven Suriname
 4 4     -55.5  4.01 52         SUR   Sipaliwini Boven Suriname
 5 5     -55.5  4.01 52         SUR   Sipaliwini Boven Suriname
 6 6     -55.4  4.05 52         SUR   Sipaliwini Boven Suriname
 7 7     -54.8  4.06 55         SUR   Sipaliwini Tapanahony    
 8 8     -55.4  4.14 52         SUR   Sipaliwini Boven Suriname
 9 9     -55.4  4.16 52         SUR   Sipaliwini Boven Suriname
10 10    -55.2  4.18 6          SUR   Brokopondo Sarakreek     
# ℹ 678 more rows


2.6 Extraction

For weather data, we use a different function for extracting the data, namely extract_cell_by_poly(). This function doesn’t aggregate the values within the polygons but extract each all the cell values within the division separately. This is important as we want to compute the SPI for each cell and only later aggregate them.

Note

To extract each cells is more computationally and memory demanding, especially with large countries and long time series, but it increases precision as the aggregation, thus lost of information, is done at very last stage of the process.

Looking at the result, we see first the ID_adm_div column, that identifies the unique administrative divisions. The second and third column are the coordinates of the cells. The fourth is the amount of the cell that actually falls within the administrative division. The other columns contain the weather observations over time specific to each cell.

pre_cell <- extract_cell_by_poly(weather_monthly, adm_div)
pre_cell
# A tibble: 5,886 × 531
   ID_adm_div x_cell y_cell coverage_fraction X1981_01_01 X1981_02_01
   <chr>       <dbl>  <dbl>             <dbl>       <dbl>       <dbl>
 1 1           -55.3   5.12             0.332        171.        267.
 2 1           -55.2   5.12             0.633        182.        252.
 3 1           -55.2   5.12             0.699        207.        253.
 4 1           -55.1   5.12             0.765        212.        268.
 5 1           -55.1   5.12             0.491        193.        265.
 6 1           -55.3   5.07             0.553        177.        272.
 7 1           -55.2   5.07             1            194.        268.
 8 1           -55.2   5.07             1            217.        270.
 9 1           -55.1   5.07             1            218.        274.
10 1           -55.1   5.07             0.996        195.        266.
# ℹ 5,876 more rows
# ℹ 525 more variables: X1981_03_01 <dbl>, X1981_04_01 <dbl>,
#   X1981_05_01 <dbl>, X1981_06_01 <dbl>, X1981_07_01 <dbl>, X1981_08_01 <dbl>,
#   X1981_09_01 <dbl>, X1981_10_01 <dbl>, X1981_11_01 <dbl>, X1981_12_01 <dbl>,
#   X1982_01_01 <dbl>, X1982_02_01 <dbl>, X1982_03_01 <dbl>, X1982_04_01 <dbl>,
#   X1982_05_01 <dbl>, X1982_06_01 <dbl>, X1982_07_01 <dbl>, X1982_08_01 <dbl>,
#   X1982_09_01 <dbl>, X1982_10_01 <dbl>, X1982_11_01 <dbl>, …

2.7 Compute the SPI

We now compute the SPI with the function compute_spi(). This function requires the precipitation time series for each location and the time scale at which the SPI is computed.

To compute the SPI, it is recommended to use at least 30 years of observation to ensure a good estimation of the parameters. More years can strength the estimation but the results can be affected by climate change: if there have been a change in the climate parameters, old observations might be not indicative of the current situation affecting the estimation. There are no clear rule on this, so we leave add the possibility to select the time range of observation with the function select_by_dates(). The function requires both or just one between the starting date, from, and the end date to. If both are provide the the function select between the two dates, if only from is provided the function selects all date after, and if only to is provided the function selects all date before.

Looking at the result, we see first is the ID_adm_div column, that we will use to merge back with the survey. Then we have the coordinates of each cell and the coverage fraction of the cell within the administrate border. The other columns contain the SPI observations over time, specific to each coordinate.

# coord  <- select_by_dates(coord, from = "1991-01-01", to = "2023-01-01") 

spi3 <- compute_spi(pre_cell, time_scale = 3)
spi3
# A tibble: 5,886 × 531
   ID_adm_div x_cell y_cell coverage_fraction X1981_01_01 X1981_02_01
   <chr>       <dbl>  <dbl>             <dbl>       <dbl>       <dbl>
 1 1           -55.3   4.82          0.000447          NA          NA
 2 1           -55.3   4.77          0.617             NA          NA
 3 1           -55.3   4.82          0.915             NA          NA
 4 1           -55.3   4.87          0.762             NA          NA
 5 1           -55.3   4.92          0.575             NA          NA
 6 1           -55.3   4.97          0.715             NA          NA
 7 1           -55.3   5.02          0.521             NA          NA
 8 1           -55.3   5.07          0.553             NA          NA
 9 1           -55.3   5.12          0.332             NA          NA
10 1           -55.2   4.77          0.369             NA          NA
# ℹ 5,876 more rows
# ℹ 525 more variables: X1981_03_01 <dbl>, X1981_04_01 <dbl>,
#   X1981_05_01 <dbl>, X1981_06_01 <dbl>, X1981_07_01 <dbl>, X1981_08_01 <dbl>,
#   X1981_09_01 <dbl>, X1981_10_01 <dbl>, X1981_11_01 <dbl>, X1981_12_01 <dbl>,
#   X1982_01_01 <dbl>, X1982_02_01 <dbl>, X1982_03_01 <dbl>, X1982_04_01 <dbl>,
#   X1982_05_01 <dbl>, X1982_06_01 <dbl>, X1982_07_01 <dbl>, X1982_08_01 <dbl>,
#   X1982_09_01 <dbl>, X1982_10_01 <dbl>, X1982_11_01 <dbl>, …

If we want to calculate the SPI with time scale equal to one, we just need to change the time_scale argument.


2.8 Agregate at the adiministrative divisions

We have computed the climatic parameters for each cells but we still need to aggregate them at the administrative divisions. The function agg_to_adm_div() can do it for us, be aware the the function aggregate by using the weighted mean, where the weights are provided by the coverage_fraction variable.

The argument match_col identifies which columns are aggregated. In this case we want to select all dates observation, so we define the pattern to look for as any number [0-9] repeated four time {4}.

In the results we lose the the information on the specific cells and we are left only with the administrative division id, ID_adm_div, and a single value of the climatic parameters for each locations.

spi3_adm <- agg_to_adm_div(spi3, match_col = "[0-9]{4}")

spi3_adm
# A tibble: 62 × 528
   ID_adm_div X1981_01_01 X1981_02_01 X1981_03_01 X1981_04_01 X1981_05_01
   <chr>            <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
 1 1                  NaN         NaN     0.539         0.839       0.783
 2 10                 NaN         NaN     0.484         0.936       0.378
 3 11                 NaN         NaN     0.322         1.22        0.171
 4 12                 NaN         NaN     0.462         1.19        0.510
 5 13                 NaN         NaN    -0.00298       1.06        0.683
 6 14                 NaN         NaN     0.0467        1.06        0.644
 7 15                 NaN         NaN     0.0284        1.05        0.445
 8 16                 NaN         NaN     0.187         0.800       0.464
 9 17                 NaN         NaN     0.190         0.912       0.454
10 18                 NaN         NaN     0.406         1.06        0.416
# ℹ 52 more rows
# ℹ 522 more variables: X1981_06_01 <dbl>, X1981_07_01 <dbl>,
#   X1981_08_01 <dbl>, X1981_09_01 <dbl>, X1981_10_01 <dbl>, X1981_11_01 <dbl>,
#   X1981_12_01 <dbl>, X1982_01_01 <dbl>, X1982_02_01 <dbl>, X1982_03_01 <dbl>,
#   X1982_04_01 <dbl>, X1982_05_01 <dbl>, X1982_06_01 <dbl>, X1982_07_01 <dbl>,
#   X1982_08_01 <dbl>, X1982_09_01 <dbl>, X1982_10_01 <dbl>, X1982_11_01 <dbl>,
#   X1982_12_01 <dbl>, X1983_01_01 <dbl>, X1983_02_01 <dbl>, …


2.9 Merge with survey

Now, we combine the extracted weather data with the survey data.

However, the surveys do not carry information on the administrative division we have used, therefore we need an additional step to provide this information. We calculated this link information before and save it as srvy_adm_div. We first merge the link information with the spatial extracted variables, the output is then merge with the survey. Note that the pipe command |> assumes that the left side is the first argument in the function, as it is not the case for us we need to specify it with y = _, where y is the name of the argument and _ refer to the previous merge.

We can see that the result has all the information we retained from the surveys, the information about the administrative divisions, and the new extracted spatial variables.

srvy_spi3_adm <- dplyr::inner_join(srvy_adm_div, spi3_adm, by = "ID_adm_div") |>
    merge_by_common(srvy_coord, y = _)
srvy_spi3_adm
1
merge adm info with spatial var
2
_ refers to the output of the previous merge
# A tibble: 4,535 × 539
   ID        hhid lat_cen long_cen  wave end_date_n   lon   lat ID_adm_div iso  
   <chr>    <dbl>   <dbl>    <dbl> <dbl> <date>     <dbl> <dbl> <chr>      <chr>
 1 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 2 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 3 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 4 1      2.39e10    3.85    -54.2     2 2022-12-07 -54.2  3.85 55         SUR  
 5 1      2.39e10    3.85    -54.2     2 2022-12-07 -54.2  3.85 55         SUR  
 6 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 7 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 8 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 9 1      2.39e10    3.85    -54.2     2 2022-12-07 -54.2  3.85 55         SUR  
10 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
# ℹ 4,525 more rows
# ℹ 529 more variables: adm_div_1 <chr>, adm_div_2 <chr>, X1981_01_01 <dbl>,
#   X1981_02_01 <dbl>, X1981_03_01 <dbl>, X1981_04_01 <dbl>, X1981_05_01 <dbl>,
#   X1981_06_01 <dbl>, X1981_07_01 <dbl>, X1981_08_01 <dbl>, X1981_09_01 <dbl>,
#   X1981_10_01 <dbl>, X1981_11_01 <dbl>, X1981_12_01 <dbl>, X1982_01_01 <dbl>,
#   X1982_02_01 <dbl>, X1982_03_01 <dbl>, X1982_04_01 <dbl>, X1982_05_01 <dbl>,
#   X1982_06_01 <dbl>, X1982_07_01 <dbl>, X1982_08_01 <dbl>, …

We are back at 4535, which matches the original survey.


2.10 Select based on the interview

Now that we have merged the SPI values with the survey, we can select just the relevant observations.

If we want to select just a subsets of observations we can use the select_by_dates() function. If we want to select based on the date of interview of the survey, we can use select_by_interview(). This last function requires the variable that contains the dates of interview and the interval to select based on the dates. The interval must be express in number of months or in number years. The wide argument specifies how the output should be reported, in wide with each time observation as separate columns, or long, with all observation in one column.

Note that current version of the select_by_interview() functions drops the observations with missing date of interview.

What is relevant depends on the particular application, but we can agree that we don’t want to assign weather observations that happened after the interviews, as these cannot influence the answers.

In this tutorial we select the 12 months before the interviews using the function select_by_interview(). The argument interview select the variable containing the date of interviews, and the argument interval defines how back in time the function needs to select the observations.

If there are missing observations for the date of interviews, the function warns us that these observations are dropped.

spi3_hh <- select_by_dates(srvy_spi3_adm, "2020-01-01", "2023-01-01") |>
    select_by_interview(interview = end_date_n,
                        interval = "1 year",
                        wide = FALSE)
Missing interview are dropped!
spi3_hh
# A tibble: 30,066 × 14
   ID        hhid lat_cen long_cen  wave end_date_n   lon   lat ID_adm_div iso  
   <chr>    <dbl>   <dbl>    <dbl> <dbl> <date>     <dbl> <dbl> <chr>      <chr>
 1 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 2 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 3 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 4 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 5 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 6 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 7 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 8 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
 9 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
10 1      2.39e10    3.85    -54.2     2 2022-12-06 -54.2  3.85 55         SUR  
# ℹ 30,056 more rows
# ℹ 4 more variables: adm_div_1 <chr>, adm_div_2 <chr>, date <chr>, value <dbl>


2.11 Save

The final step of the code is to save the result. In this case, we save it as a dta file using the haven::write_dta() function.

haven::write_dta(spi3_hh, file.path(path_to_result, "spi_3_adm.dta"))